
    A distance for probability spaces, and long-term values in Markov Decision Processes and Repeated Games

    Given a finite set $K$, we denote by $X=\Delta(K)$ the set of probabilities on $K$ and by $Z=\Delta_f(X)$ the set of Borel probabilities on $X$ with finite support. Studying a Markov Decision Process with partial information on $K$ naturally leads to a Markov Decision Process with full information on $X$. We introduce a new metric $d_*$ on $Z$ such that the transitions become 1-Lipschitz from $(X, \|\cdot\|_1)$ to $(Z, d_*)$. In the first part of the article, we define and prove several properties of the metric $d_*$. In particular, $d_*$ satisfies a Kantorovich-Rubinstein type duality formula and can be characterized using disintegrations. In the second part, we characterize the limit values in several classes of "compact non-expansive" Markov Decision Processes. In particular, we use the metric $d_*$ to characterize the limit value in Partial Observation MDPs with finitely many states and in Repeated Games with an informed controller with finite sets of states and actions. Moreover, in each case we prove the existence of a generalized notion of uniform value, where we consider not only the Cesàro mean when the number of stages is large enough but any evaluation function $\theta \in \Delta(\mathbb{N}^*)$ whenever the impatience $I(\theta)=\sum_{t\geq 1} |\theta_{t+1}-\theta_t|$ is small enough.
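    As a concrete illustration (not taken from the abstract), the impatience of an evaluation function is easy to compute numerically. The sketch below, under the assumption that $\theta$ is represented as a finite list of stage weights followed implicitly by zeros, shows that the Cesàro mean over $n$ stages has impatience exactly $1/n$:

    ```python
    def impatience(theta):
        """I(theta) = sum over t >= 1 of |theta_{t+1} - theta_t|, for an evaluation
        function given as a finite list of weights (implicitly followed by zeros)."""
        padded = list(theta) + [0.0]
        return sum(abs(padded[t + 1] - padded[t]) for t in range(len(padded) - 1))

    def cesaro(n):
        """Cesaro evaluation over the first n stages: theta_t = 1/n for t <= n."""
        return [1.0 / n] * n

    # The only jump of the Cesaro evaluation is from theta_n = 1/n down to 0,
    # so its impatience is 1/n.
    print(impatience(cesaro(10)))  # 0.1
    ```

    This makes the generalization concrete: the uniform-value result covers not just Cesàro means but any evaluation whose weights vary slowly enough from one stage to the next.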

    Recursive games: Uniform value, Tauberian theorem and the Mertens conjecture "$Maxmin=\lim v_n=\lim v_\lambda$"

    We study two-player zero-sum recursive games with a countable state space and finite action spaces at each state. When the family of $n$-stage values $\{v_n, n\geq 1\}$ is totally bounded for the uniform norm, we prove the existence of the uniform value. Together with a result of Rosenberg and Vieille (2000), we obtain a uniform Tauberian theorem for recursive games: $(v_n)$ converges uniformly if and only if $(v_\lambda)$ converges uniformly. We apply our main result to finite recursive games with signals (where players observe only signals on the state and on past actions). When the maximizer is more informed than the minimizer, we prove the Mertens conjecture $Maxmin=\lim_{n\to\infty} v_n=\lim_{\lambda\to 0} v_\lambda$. Finally, we deduce the existence of the uniform value in finite recursive games with symmetric information.

    Existence de la valeur uniforme dans les jeux répétés (Existence of the uniform value in repeated games)

    In this dissertation, we consider a general model of two-player zero-sum repeated games and, in particular, the problem of the existence of a uniform value. A repeated game has a uniform value if both players can guarantee the same payoff in every game that starts today and is sufficiently long, independently of the length of the game. In the first chapter, we focus on the one-player case, known as Partially Observable Markov Decision Processes, and on repeated games where one player is perfectly informed and controls the transitions. These games are known to have a uniform value. By introducing a new metric on the probabilities over a simplex in $\mathbb{R}^m$, we show the existence of a stronger notion, where the players guarantee the same payoff on every sufficiently long interval of stages, and not only on those starting today. In the next two chapters, we show the existence of the uniform value in two special models of repeated games: commutative repeated games in the dark, where the players do not observe the state but the state is independent of the order in which the actions are played, and repeated games with a more informed controller, where one player controls the transitions and has more information than the other player. In the last chapter, we study the link between the uniform convergence of the values of the $n$-stage games and the asymptotic behavior of optimal strategies in these $n$-stage games. For each $n$, we consider $n$-stage optimal strategies and the payoff they guarantee during the first $ln$ stages, with $0 < l < 1$, and study the asymptotics of this payoff as $n$ goes to infinity.


    Asymptotic Properties of Optimal Trajectories in Dynamic Programming

    We prove, in a dynamic programming framework, that uniform convergence of the finite-horizon values implies that, asymptotically, the average accumulated payoff is constant on optimal trajectories. We analyze and discuss several possible extensions to two-person games.
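    To make the objects in this abstract concrete, here is a toy dynamic programming example (a hypothetical two-state deterministic MDP, not from the paper): the $n$-stage values $v_n$ are computed by backward induction, and the average value $v_n/n$ is seen to stabilize as $n$ grows.

    ```python
    # Hypothetical two-state MDP: r[s][a] is the payoff of action a in state s,
    # T[s][a] the (deterministic) next state. In state 1, action 1 pays 3 forever;
    # the optimal long-run average from state 0 is therefore 3.
    r = [[1.0, 0.0], [0.0, 3.0]]
    T = [[0, 1], [0, 1]]

    def n_stage_value(n, s0=0):
        """Optimal total payoff over n stages from state s0, by backward induction."""
        v = [0.0, 0.0]  # v_0 = 0 in every state
        for _ in range(n):
            v = [max(r[s][a] + v[T[s][a]] for a in (0, 1)) for s in (0, 1)]
        return v[s0]

    for n in (1, 10, 100):
        print(n, n_stage_value(n) / n)
    ```

    Here $v_n(0)/n \to 3$: for large horizons, an optimal trajectory sacrifices one stage to reach the absorbing high-payoff state, and the accumulated average payoff flattens out along it, in the spirit of the result above.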

    On finite-time ruin probabilities with reinsurance cycles influenced by large claims

    Market cycles play a major role in reinsurance. Cycle transitions are not independent of the claim arrival process: a large claim or a high number of claims may accelerate cycle transitions. To take this into account, a semi-Markovian risk model is proposed and analyzed. A refined Erlangization method is developed to compute the finite-time ruin probability of a reinsurance company. As this model requires the claim amounts to be phase-type distributed, we explain how to fit mixtures of Erlang distributions to long-tailed distributions. Numerical applications and comparisons with results obtained from simulation methods are given. The impact of the dependency between claim amounts and phase changes is studied.
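    The mixture-of-Erlangs building block mentioned above is straightforward to evaluate. The sketch below (with illustrative weights and shapes, not parameters from the paper) computes the density of a finite Erlang mixture with a common rate, the standard phase-type class used to approximate heavier-tailed claim distributions:

    ```python
    import math

    def erlang_pdf(x, k, rate):
        """Density of an Erlang(k, rate) distribution at x >= 0."""
        return rate**k * x**(k - 1) * math.exp(-rate * x) / math.factorial(k - 1)

    def mixture_pdf(x, weights, shapes, rate):
        """Density of a finite mixture of Erlangs sharing one rate parameter."""
        return sum(w * erlang_pdf(x, k, rate) for w, k in zip(weights, shapes))

    # Illustrative two-component mixture: 70% Erlang(1) (i.e. exponential)
    # plus 30% Erlang(5), common rate 1.
    print(mixture_pdf(2.0, [0.7, 0.3], [1, 5], 1.0))
    ```

    Fitting such mixtures (choosing the weights and shapes) to a long-tailed target is the calibration step the abstract refers to; the mixture itself stays phase-type, so the Erlangization machinery applies.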

    Weighted Average-convexity and Cooperative Games

    We generalize the notions of convexity and average-convexity to the notion of weighted average-convexity. We show several results on the relation between weighted average-convexity and cooperative games. First, we prove that if a game is weighted average-convex, then the corresponding weighted Shapley value is in the core. Second, we exhibit necessary conditions for a communication TU-game to preserve weighted average-convexity. Finally, we provide a complete characterization when the underlying graph is a priority-decreasing tree.
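    For readers unfamiliar with the objects involved, the following sketch illustrates the unweighted special case on a hypothetical symmetric three-player convex game (not an example from the paper): the Shapley value is the average of marginal-contribution vectors over all player orderings, and for a convex game it lies in the core.

    ```python
    from itertools import permutations

    # Hypothetical convex 3-player TU-game: singletons are worth 1, pairs 3,
    # the grand coalition 6 (marginal contributions 1 -> 2 -> 3 are increasing).
    v = {frozenset(): 0, frozenset({1}): 1, frozenset({2}): 1, frozenset({3}): 1,
         frozenset({1, 2}): 3, frozenset({1, 3}): 3, frozenset({2, 3}): 3,
         frozenset({1, 2, 3}): 6}
    players = (1, 2, 3)

    def shapley(v, players):
        """Average of each player's marginal contributions over all orderings."""
        phi = {i: 0.0 for i in players}
        orders = list(permutations(players))
        for order in orders:
            coalition = frozenset()
            for i in order:
                phi[i] += v[coalition | {i}] - v[coalition]
                coalition = coalition | {i}
        return {i: phi[i] / len(orders) for i in players}

    phi = shapley(v, players)
    # Core check: every coalition S must receive at least v(S).
    in_core = all(sum(phi[i] for i in S) >= val for S, val in v.items())
    print(phi, in_core)  # by symmetry phi = (2, 2, 2), and it is in the core
    ```

    The paper's result is the weighted analogue: under weighted average-convexity, the *weighted* Shapley value (where orderings are no longer equally likely) still lands in the core.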


    On values of repeated games with signals

    We study the existence of different notions of value in two-person zero-sum repeated games where the state evolves and players receive signals. We provide examples showing that the limsup value (and the uniform value) may fail to exist in general. We then show the existence of the value for any Borel payoff function if the players observe a public signal including the actions played. We also prove two other positive results without assumptions on the signaling structure: the existence of the $\sup$ value in any game and the existence of the uniform value in recursive games with nonnegative payoffs. Published at http://dx.doi.org/10.1214/14-AAP1095 in the Annals of Applied Probability by the Institute of Mathematical Statistics.